IBIS Macromodel Task Group

Meeting date: 04 February 2020

Members (asterisk for those attending):
ANSYS:                        Dan Dvorscak
                            * Curtis Clark
Cadence Design Systems:     * Ambrish Varma
                              Ken Willis
                              Kumar Keshavan
Intel:                      * Michael Mirmak
Keysight Technologies:      * Fangyi Rao
                            * Radek Biernacki
                            * Ming Yan
                              Todd Bermensolo
Marvell:                    * Steve Parker
Mentor, A Siemens Business: * Arpad Muranyi
Micron Technology:          * Randy Wolff
                            * Justin Butterfield
SiSoft (Mathworks):         * Walter Katz
                              Mike LaBonte
SPISim:                     * Wei-hsing Huang
Teraspeed Labs:             * Bob Ross

The meeting was led by Arpad Muranyi. Curtis Clark took the minutes.

--------------------------------------------------------------------------------
Opens:

- Walter noted two new topics that could be added to the agenda:
  1. Walter had a presentation on a proposed 56GB PAM4 BCI protocol.
  2. Michael M. would like discussions based on a Keysight and Achronix paper
     from the DesignCon IBIS summit. The topic is various measurements and
     metrics for AMI Models or EDA tools to provide.
- Michael M. asked if we had settled the issues with regard to AMI String
  parameters and empty strings. Arpad noted that this had been settled at the
  previous meeting, but it's possible a clarification BIRD could help address
  the original confusion. Michael M. said he would review the previous minutes
  and decide if he wanted to introduce a new BIRD.

-------------
Review of ARs:

- None.

--------------------------
Call for patent disclosure:

- None.

-------------------------
Review of Meeting Minutes:

Arpad asked for any comments or corrections to the minutes of the January 21
meeting. Walter moved to approve the minutes. Randy seconded the motion. There
were no objections.

-------------
New Discussion:

BIRD201:
Arpad asked if there was anything to discuss and noted that the BIRD was
introduced at the previous Open Forum meeting. Walter said people should review
it and ask questions if any further clarifications are needed.
If there are no requests for further clarification, then we could schedule a
vote in the Open Forum.

DDR Clock Forwarding:
Fangyi reviewed "Modeling DDR Clock Forwarding with a New AMI_GetWave API",
which was an abridged version of a presentation from the DesignCon IBIS Summit.

slide 2: DFE and Clock Forwarding Architectures in DDR5 Data Buffer
- Block diagram
- DQ Rx uses a forwarded clock to control the slicer
  - DQ signal -> op amp -> summer -> slicer
  - slicer controlled by DQS strobe

slide 3: Proposed New GetWave API to Model Clock Forwarding
- New function AMI_GetWave2()
  - Implemented by the DQ Rx model
  - two waveform inputs
    - wave
      - this is the existing waveform argument and is used for the DQ signal
      - input and output
    - wave_ref
      - this second waveform argument is for the forwarded clock
      - only an input
      - takes the input strobe waveform, which is the output waveform from a
        DQS Rx Model's AMI_GetWave().
      - same size as wave
  - DQ Rx Model can process data and strobe waves internally
    - If the model processes the strobe, it would typically be the
      controller's DQ Rx model.
  - clock_times
    - output clock ticks generated by DQ Rx model based on wave_ref

Arpad said he thought the name "wave_ref" was a bit confusing. He said it had
first made him think of the reference side of the op amp. Fangyi said he
understood Arpad's concern and would choose a different name for the final
proposal (BIRD).

slide 4: Time Domain Simulation Flow
- Block diagram
- A single DQS serves multiple DQ lines.

slide 5: Example using this new API to model the Controller DQ Rx (Read Cycle)
- Block diagram of the Controller DQ Rx Model
  - wave_ref input to the DQ Rx DLL is the DQS Rx output waveform from a
    separate DQS Rx DLL.
- Block diagram of DQ Rx
  - wave -> VGA -> Gain compression -> CTLE -> DFE (slicer clocked by DQS)
  - wave_ref -> Phase Interpolator (PI) -> Parasitic low-pass filter -> slicer
- Model can process both the data and strobe signals
- Model can dynamically tune the PI to adjust strobe skew
- Model can perform Adaptive DFE
- PI training and DFE adaptation stop after Ignore_Bits

slide 6: Phase Interpolator Review
- Input waveform -> applies weighted sum of two delays.
- Weighted sum controlled by integer index n = 0...N
- Performance of PI is measured by the linearity of the relationship between n
  and the output delay.

slide 7: PI training in the Controller DQ Rx Model
- Model can internally train the PI to adjust skew for optimal DFE clocking
- Output DQ Eye diagrams shown with and without PI training
- DQS waveform shown before and after PI
  - Data and strobe are center-aligned after the PI

slide 8: Example DRAM DQ Rx Model (Write Cycle)
- Block diagram of the DRAM DQ Rx Model
  - wave_ref input to the DQ Rx DLL is the DQS Rx output waveform from a
    separate DQS Rx DLL.
- DQ to DQS skew in a real system is optimized by the Controller on the
  transmit side during write leveling training at startup time.
- DRAM Rx model directly uses the DQS input waveform.
- Block diagram of DQ Rx
  - wave -> VGA -> Gain compression -> DFE (slicer clocked by DQS)
  - wave_ref -> slicer
- Possibility for the model maker to do extra processing of the DQS if needed,
  e.g., to add internal delay to the strobe.

slide 9: Jitter Tracking and Unmatched IO Rx
- Correlated jitter in DQ and DQS can be tracked in DQ Rx by clock forwarding
- DDR4 (and earlier) DRAMs pad the DQ path to match the DQS path
  - DQS path has more logic
- DDR5 supports unmatched DQ and DQS Rx on controller and DRAM sides
  - Unmatched reduces DQ-DQS jitter correlation
  - Adversely affects jitter tracking and DFE

slide 10: Jitter Tracking and Unmatched IO Rx cont.
- Left 2 eye diagrams
  - DQ input and output with no Tx SJ
- Middle 2 eye diagrams
  - DQ input and output with Tx SJ and no DQ-DQS delay
- Right 2 eye diagrams
  - DQ input and output with Tx SJ and 5UI DQ-DQS delay
  - further reduction in eye opening due to unmatched paths
- New API can model these effects.

Walter asked about the DRAM DQ Rx Model example (slide 8). He asked if both the
"DQS Rx input waveform" and the "DQ Rx input data waveform" would be waveforms
measured at the DRAM during a wide bus simulation containing crosstalk between
everything. Fangyi said they would. Walter suggested it might be easier for the
EDA tool if a single DQ Rx Model took the raw DQS waveform as well, instead of
having a separate Model handling the raw DQS waveform. Fangyi noted that the
DQS is shared by multiple DQ lines. He noted that DQS Rx and DQ Rx are two
different buffers and asked why you'd want to combine them into one model.
Walter said he thought it might be much easier to get the timing right at the
slicer if it were all handled by the DQ Rx model maker, but he noted that this
was a detail issue for the future and he would reserve judgement until we saw
Fangyi's final BIRD.

Michael M. asked if this presentation described a completely unified flow for
DQ and DQS in which the model maker had to create a more complicated model, but
the flow for the tool was straightforward. He asked if an alternative would be
to have independent flows for DQ and DQS, perhaps two independent simulations,
which would be simpler for the model but harder for the tools. Fangyi asked why
you would want two different simulations when DQ and DQS transmit at the same
time and have crosstalk between them. He noted that in his proposal the tool
would process the block of DQS data first. It would then feed the output
waveform from the DQS model and the DQ data waveform to the DQ Rx model.
Michael M.
asked if the model maker might run multiple DQS simulations to determine corner
cases and parameter values to characterize the DQS. Then these "statistical"
characterization parameters could be provided along with the DQ Rx model for
use in a DQ Rx simulation in statistical or time domain. He asked if this might
be an alternative to requiring the user to set up a single comprehensive
simulation of DQ and DQS.

Arpad noted that the presentation only discussed the time domain bit-by-bit
flow, and asked if there was any thought of statistical flow. He noted another
presentation from the summit in which someone had talked about positioning the
UI based on the IR. He asked if DQS clocking effects could be handled in a
similar way in a statistical simulation. Fangyi noted that he had only been
focused on bit-by-bit flow, and had been thinking of leaving AMI_Init()
unchanged. He noted that perhaps we could consider an analogous approach for
statistical flow and add a second IR to AMI_Init() to capture the IR of the DQS
path, but he said he had not thought about it carefully yet.

Michael M. noted that with the current version of AMI_Init() we have the use of
various jitter parameters that are assumed to be buffer specific as opposed to
including system effects. He wondered if we could get creative with In and Out
type parameters to allow system effects to be folded into these statistical
parameters. There might be a tradeoff between accuracy and ease of simulation
flow that we could consider.

Walter noted that for the write cycle a two-simulation approach could work. You
could do one simulation to get the timing and determine the skew for each DQ.
Then you could re-run statistical or time domain simulations with the skew
values because they would determine where DQ <-> DQS crosstalk would occur.

Ambrish noted that he had demonstrated an approach that used the clock_times
array argument as an input to pass clock ticks into the model.
He asked why that approach couldn't be used to pass the DQS model's output
clock ticks into the DQ model as an input. Fangyi said the DQ model needs the
clock waveform as an input in order to properly model the non-linearities in
the PI. Ambrish asked if you'd only need this for the read cycle (where the
controller adjusts the PI). Fangyi said the new API would handle both
directions. He said that in the physical system the Rx processed the DQS
waveform. He noted that each DQ lane would have its own PI and train for its
own skew. So this processing can't be done in the single DQS model shared by
multiple DQ lines.

Arpad asked what the next step was. Fangyi said he would draft a BIRD. He noted
that his initial BIRD would only consider time-domain flow.

Generic Tx BCI Protocol:
Walter briefly reviewed the DDR5 DQ Write Protocol he had presented at
DesignCon and developed during work on BIRD201. It was designed for a flow in
which the Tx controls the optimization. He noted that the other scenario is
when the Rx controls the Tx, as in a typical 56GB PAM4 system. He noted that
Marvell and Cadence had presented a 56GB PAM4 paper at DesignCon based on a
simple protocol that involved incrementing and decrementing Tx taps.

Walter reviewed a proposal he had sent to ATM the day before the meeting for a
Generic Tx BCI protocol.

slide 2: General Flow Summary

slide 3: What the Tx tells the Rx the First Time
- How many pre and post cursor taps
- Granularity, min, max values of the taps

slide 4: What does the Rx tell the Tx Every time
- Tap settings
  - This example uses "Taps_Register"
    - provide the tap number and its register setting
- Note: initial presentation contains extra "0" values in the tap settings on
  slides 4 and 5. These typos will be corrected in the next version.

slide 5: Alternatives for Rx setting the Tx
- Taps_Register
  - Rx specifies the tap's register value
- Taps_Value
  - Rx specifies the tap's value
- Taps_Increment
  - Rx specifies increment/decrement

slide 6: What Tx tells the Rx After the First Time
- Tx reports back what it did in response to Rx command
  - This feedback is required in case it hits a limit or somehow can't honor
    the request

Walter noted that the WhoAmI and Sequence values (shown in all examples) are to
ensure timing issues and file I/O don't corrupt the sequence of commands going
back and forth between the Tx and Rx.

Ambrish asked if Walter was proposing a generic Tx protocol that would be IBIS
approved and be controlled by IBIS. Walter said this was to be determined and
noted that IBIS had never formalized a scheme for officially sanctioned BCI
protocols. Walter noted that for this example protocol, a compliant Rx wouldn't
have to use every possible option (setting register values, tap values, or
incrementing), but a compliant Tx should support all of them.

Arpad said he could envision a section of the IBIS webpage devoted to BCI
protocols, much like BIRDs and BUGs had their own page. We could post protocols
and assign them official numbers. Walter said that would be a good step
forward.

- Michael M.: Motion to adjourn.
- Walter: Second.
- Arpad: Thank you all for joining.

-------------
Next meeting: 11 February 2020 12:00pm PT
-------------

IBIS Interconnect SPICE Wish List:

1) Simulator directives